Introduction

Article & Datasets

  • Nature Communications article: Mora et al., 2019: Space Station conditions are selective but do not alter microbial characteristics relevant to human health
  • Two sets of stacked bar graphs (fig 1 & 4) describing microbial diversity
  • Grouped by sampling session (A, B, C)
  • Number of reads per unique sample, stratified by taxonomic classification (Domain, Phylum, Genus)

SD1

  • Supplementary data 1 (fig 1): RSV table (Ribosomal Sequence Variants)
    RSV ISSCapoA1 ISSCapoA2 ISSCapoA3 ISSCapoA4 ISSCapoA5 ISSCapoA6 ISSCapoA7 ISSCapoA8 ISSCapoA9 ISSCapoC1 ISSCapoC2 ISSCapoC3 ISSCapoC4 ISSCapoC5 ISSCapoC6 ISSCapoC7 ISSCapoB1 ISSCapoB2 ISSCapoB3 ISSCapoB4 ISSCapoB5 ISSCapoB6 ISSCapoB7 ISSCapoB8
    RSV_0001 0 0 0 0 0 0 0 0 0 0 10 0 0 160 0 0 0 0 0 31 1080 0 0 0
    RSV_0002 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 9 0 0
    RSV_0003 0 0 0 74 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    Taxonomic classification
    k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Ruminococcaceae;g__Faecalibacterium;s__
    k__Bacteria;p__Cyanobacteria;c__Melainabacteria;o__Gastranaerophilales;f__;g__;s__
    k__Bacteria;p__Planctomycetes;c__Phycisphaerae;o__WD2101_soil_group;f__;g__;s__
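
The taxonomy strings have to be split into ranks before reads can be stratified by Domain, Phylum or Genus. A minimal sketch of one way to do this, assuming every rank carries its usual two-underscore prefix (k__, p__, ... s__) and that the string sits in a column we call taxonomy here (a hypothetical name); the two rows are toy examples modeled on SD1:

```r
library(dplyr)
library(tidyr)

# Toy rows modeled on SD1; the column name `taxonomy` is an assumption
sd1 <- tibble(
  RSV = c("RSV_0001", "RSV_0003"),
  taxonomy = c(
    "k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Ruminococcaceae;g__Faecalibacterium;s__",
    "k__Bacteria;p__Planctomycetes;c__Phycisphaerae;o__WD2101_soil_group;f__;g__;s__"
  )
)

ranks <- c("domain", "phylum", "class", "order", "family", "genus", "species")

sd1_tax <- sd1 %>%
  # One column per rank, split on the semicolons
  separate(taxonomy, into = ranks, sep = ";", fill = "right") %>%
  # Drop the rank prefix (k__, p__, ...) and turn unassigned ranks into NA
  mutate(across(all_of(ranks), ~ na_if(sub("^[a-z]__", "", .x), "")))
```

Unassigned ranks become NA, so they can be dropped or relabeled explicitly before plotting.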

SD2

  • Supplementary data 2 (fig 4): Taxonomic diversity inferred from metagenomic dataset
    domain    phylum                               className                            order            family         genus
    Viruses   unclassified (derived from Viruses)  unclassified (derived from Viruses)  Caudovirales     Podoviridae    AHJD-like viruses
    Bacteria  Firmicutes                           Bacilli                              Lactobacillales  Aerococcaceae  Abiotrophia
    COLA1  COLB1  N2A  N2B  N3C1  N1C
    11     130    2    25   32    33
    17     898    10   98   102   119
    0      1      0    0    0     0

Table 1

  • Table 1 (tidy): sampled surfaces, locations, wipe ID and session ID
    Sampled_surface                ISS_module  Wipe  Session
    Ambient air (field blank, FB)  Columbus    A5    A
    Ambient air (field blank, FB)  Columbus    A5    B
    Ambient air (field blank, FB)  Columbus    B1    A
    Ambient air (field blank, FB)  Columbus    B1    B
    Light covers                   Columbus    A4    A
    Light covers                   Columbus    A4    B
    Light covers                   Columbus    B2    A
    Light covers                   Columbus    B2    B
    SCC laptop                     Columbus    A2    A
    SCC laptop                     Columbus    A2    B

Data Processing Workflow

Results & Discussion

Figure 1: Domain

Figure 1: Phylum

Figure 1: Genus (our plot)

Figure 1: Genus (their plot)

Figure 1 Discussion

  • The Domain and Phylum plots can be reproduced exactly.
  • Domain: Archaea are present in all 3 sessions.
  • Phylum: 3 archaeal phyla (Woesearchaeota, Thaumarchaeota and Euryarchaeota) detected in Cupola air (C) and Columbus air (B)
  • Genus:
    • Acinetobacter is predominant in Columbus air for sessions A and B, whereas Streptococcus is predominant in session C
    • Not 100% reproducible
      • No access to the clean room data
      • Bars do not sum to 100%
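
One possible reason for bars falling short of 100% is that fractions are computed over all taxa while only a subset of genera is plotted. A minimal sketch of how the remainder could be lumped into an "Other" category so that every bar sums to 1; this is our assumption about the cause, not the authors' documented method, and the sample names, column names and read counts below are made up:

```r
library(dplyr)

# Toy long-format counts; samples and values are invented for illustration
counts <- tibble(
  sample = rep(c("S1", "S2"), each = 4),
  genus  = rep(c("Streptococcus", "Staphylococcus",
                 "Corynebacterium", "Propionibacterium"), times = 2),
  reads  = c(50, 30, 15, 5, 10, 60, 20, 10)
)

# Genera to show individually: the 2 with the most reads overall
top_genera <- counts %>%
  count(genus, wt = reads, name = "total") %>%
  slice_max(total, n = 2, with_ties = FALSE) %>%
  pull(genus)

rel_abund <- counts %>%
  # Everything outside the top genera is pooled into "Other"
  mutate(genus = if_else(genus %in% top_genera, genus, "Other")) %>%
  group_by(sample, genus) %>%
  summarise(reads = sum(reads), .groups = "drop") %>%
  # Fractions are taken per sample, so each bar closes at 1
  group_by(sample) %>%
  mutate(fraction = reads / sum(reads)) %>%
  ungroup()
```

Plotting fraction instead of raw reads then gives stacked bars of equal height.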

Figure 4: Domain

Figure 4: Phylum

Figure 4: Genus (our plot)

Figure 4: Genus (their plot)

Figure 4 Discussion

  • Our selection method: sort by count and keep the top 200 with arrange(desc(count)) %>% top_n(200, count)
  • Domain: ‘other sequences’ is not visible in our plot, but it accounts for only 1 read
    domain n
    Archaea 59
    Bacteria 592
    Eukaryota 390
    other sequences 1
    Viruses 73
  • The detected archaeal signatures (Thaumarchaeota, Nanoarchaeota, Euryarchaeota) reflect the type of surface sampled
  • Propionibacterium reads predominate in all sampling sessions
  • ISS microbiome characterised by an abundance of human-associated Staphylococcus, Corynebacterium and Streptococcus
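
A note on the top-200 selection: top_n() falls back to the last column of the table when no weighting column is given, and it is superseded in dplyr 1.0.0 and later by slice_max(). A minimal sketch of the explicit form, assuming the counts sit in a column named count; the values are toy data and n_keep is set to 2 only to keep the example small:

```r
library(dplyr)

# Toy table; `genus` and `count` are assumed column names, values are invented
genus_counts <- tibble(
  genus = c("Streptococcus", "Propionibacterium",
            "Staphylococcus", "Corynebacterium"),
  count = c(100, 400, 250, 250)
)

n_keep <- 2  # would be 200 for the selection described above

top_genera <- genus_counts %>%
  # Ranks explicitly on `count`; with_ties = FALSE returns exactly n_keep rows
  slice_max(count, n = n_keep, with_ties = FALSE)
```

slice_max() also returns the rows sorted by count in descending order, so a separate arrange(desc(count)) is not needed.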

Conclusion

  • This study reveals a data reproducibility problem: the published figures could only be partly reproduced from the supplementary data

  • Further perspectives & improvements

    • Obtaining the clean room data could completely change figure 1
    • Automate total sum scaling (TSS) with purrr instead of one mutate() line per sample, and rename the SD1 columns by referencing Table 1
    mutate(ISSCapoA1 = ISSCapoA1/sum(ISSCapoA1),
           ISSCapoA2 = ISSCapoA2/sum(ISSCapoA2),
           ISSCapoA3 = ISSCapoA3/sum(ISSCapoA3))
    • We tried to recreate the PCoA plot, but the overwhelming number of zeros made it infeasible
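
A minimal sketch of how the TSS step could be automated, so that no sample column has to be typed by hand. The table is a toy slice with three of the SD1 column names and invented counts; tss() is a helper name we introduce here:

```r
library(dplyr)
library(purrr)

# Toy slice of SD1: real column names, invented counts
sd1 <- tibble(
  RSV       = c("RSV_0001", "RSV_0002", "RSV_0003"),
  ISSCapoA1 = c(10, 30, 60),
  ISSCapoA2 = c(0, 5, 15),
  ISSCapoA3 = c(2, 2, 4)
)

# Total sum scaling: divide every count by its column (sample) total
tss <- function(x) x / sum(x)

# purrr version: apply tss() to every numeric column, keep the rest as is
sd1_tss <- sd1 %>% modify_if(is.numeric, tss)

# dplyr-only equivalent, selecting the sample columns by their prefix
sd1_tss_across <- sd1 %>% mutate(across(starts_with("ISSCapo"), tss))
```

Both versions scale each sample column to a sum of 1. A sample with zero total reads would yield NaN and needs to be filtered out first.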